A TALE-inspired computational screen for proteins that contain approximate tandem repeats
Authors
Abstract
TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that bacteria secrete into plant cells, where they act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ at conserved positions, and these positions define specificity. Using Perl, we screened ~47 million protein sequences for TALE-like architecture, characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability at conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, and repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and from interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of the chestnut blight fungus, a eukaryotic plant pathogen.
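The core idea of the screen described above (tandem units of 30 to 43 residues that are mostly conserved but vary at a few fixed positions) can be illustrated with a minimal Python sketch. This is not the authors' Perl pipeline: the function names, the ungapped Hamming-identity criterion, and the thresholds `min_identity`, `min_units` and `min_conserved` are all assumptions chosen for illustration, and none of the scoring steps (nuclear localization, secondary structure, covariation) are covered.

```python
from collections import Counter


def hamming_identity(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def find_approximate_tandem_repeats(seq, min_len=30, max_len=43,
                                    min_identity=0.6, min_units=3):
    """Find ungapped approximate tandem repeats (hypothetical criterion).

    Returns a list of (start, period, n_units) tuples. A run is extended
    while each unit shares at least `min_identity` of its positions with
    the unit that follows it.
    """
    hits = []
    for period in range(min_len, max_len + 1):
        start = 0
        while start + 2 * period <= len(seq):
            n_units = 1
            while True:
                a = start + (n_units - 1) * period  # current unit start
                b = a + period                      # next unit start
                if b + period > len(seq):
                    break
                if hamming_identity(seq[a:b], seq[b:b + period]) < min_identity:
                    break
                n_units += 1
            if n_units >= min_units:
                hits.append((start, period, n_units))
                start += n_units * period  # skip past the reported run
            else:
                start += 1
    return hits


def variable_positions(seq, start, period, n_units, min_conserved=0.8):
    """Split a run into units and list the columns that are not conserved.

    A column counts as variable when its most common residue occurs in
    fewer than `min_conserved` of the units (an assumed threshold).
    """
    units = [seq[start + i * period:start + (i + 1) * period]
             for i in range(n_units)]
    variable = []
    for col in range(period):
        counts = Counter(unit[col] for unit in units)
        if counts.most_common(1)[0][1] / n_units < min_conserved:
            variable.append(col)
    return units, variable
```

Because the comparison is ungapped, the sketch misses repeats that contain insertions or deletions, and the reported start may be phase-shifted relative to the true unit boundary when flanking residues are present. A screen of real sequences would need an alignment-based criterion to handle both cases.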
Similar resources
COMPUTATION AND ANALYSIS A thesis presented in partial fulfilment of the requirements
Biological sequences have long been known to contain many classes of repeats. The most studied repetitive structure is the tandem repeat, where many approximate copies of a common segment (the motif) appear consecutively. In this thesis, a complex repetitive structure is investigated. This repetitive structure is called a nested tandem repeat. It consists of many approximate copies of two motif...
Molecular Dynamics Simulations of DNA-Free and DNA-Bound TAL Effectors
TAL (transcriptional activator-like) effectors (TALEs) are DNA-binding proteins containing a modular central domain that recognizes specific DNA sequences. Recently, crystallographic studies of TALEs revealed the structure of the DNA-recognition domain. In this article, molecular dynamics (MD) simulations are employed to study two crystal structures of an 11.5-repeat TALE, in the presence and ...
Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain
The tandem repeats of transcription activator-like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory, TALE DNA-binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural...
An Algorithm for Approximate Tandem Repeats
A perfect single tandem repeat is defined as a nonempty string that can be divided into two identical substrings, e.g., abcabc. An approximate single tandem repeat is one in which the substrings are similar, but not identical, e.g., abcdaacd. In this paper we consider two criteria of similarity: the Hamming distance (k mismatches) and the edit distance (k differences). For a string S of lengt...
The effect of increasing numbers of repeats on TAL effector DNA binding specificity
Transcription activator-like effectors (TALEs) recognize their DNA targets via tandem repeats, each specifying a single nucleotide base in a one-to-one sequential arrangement. Due to this modularity and their ability to bind long DNA sequences with high specificity, TALEs have been used in many applications. Contributions of individual repeat-nucleotide associations to affinity and specificity ...